Resilin: Elastic MapReduce for Private and Community Clouds
Authors
Abstract
The MapReduce programming model, introduced by Google, offers a simple and efficient way of performing distributed computation over large data sets. Although Google’s implementation is proprietary, MapReduce can be leveraged by anyone using the free and open source Apache Hadoop framework. To simplify the usage of Hadoop in the cloud, Amazon Web Services offers Elastic MapReduce, a web service enabling users to run MapReduce jobs. Elastic MapReduce takes care of resource provisioning, Hadoop configuration and performance tuning, data staging, fault tolerance, etc. This service drastically lowers the barrier to entry for performing MapReduce computations in the cloud, allowing users to concentrate on the problem to solve. However, Elastic MapReduce is restricted to Amazon EC2 resources, and is provided at an additional cost. In this paper, we present Resilin, a system implementing the Elastic MapReduce API with resources from clouds other than Amazon EC2, such as private and community clouds. Furthermore, we explore a feature going beyond the current Amazon Elastic MapReduce offering: performing MapReduce computations over multiple distributed clouds.
Keywords: Cloud computing, MapReduce, Elasticity, Hadoop, Execution platforms
∗ Université de Rennes 1, IRISA, Rennes, France
† INRIA Rennes – Bretagne Atlantique, Rennes, France
‡ West University of Timişoara, Timişoara, Romania
inria-00632040, version 1 - 13 Oct 2011
French title: Resilin: Elastic MapReduce pour nuages informatiques privés et communautaires
Similar resources
Bringing Elastic MapReduce to Scientific Clouds
The MapReduce programming model, proposed by Google, offers a simple and efficient way to perform distributed computation over large data sets. The Apache Hadoop framework is a free and open-source implementation of MapReduce. To simplify the usage of Hadoop, Amazon Web Services provides Elastic MapReduce, a web service that enables users to submit MapReduce jobs. Elastic MapReduce takes care o...
Poster: Cross Cloud MapReduce: an Uncheatable MapReduce
MapReduce [1] is becoming a popular data processing application in cloud environments. However, security issues make many customers reluctant to move their critical computation tasks to the cloud. For instance, [2] points out a real security vulnerability that the cloud service leader Amazon EC2 suffers from: some members of EC2 can create and share Amazon Machine Image (AMI) to the EC2 community so...
A Privacy Data-oriented Hierarchical MapReduce Programming Model
To protect private data efficiently in hybrid cloud services, a multi-cluster MapReduce programming model based on a hierarchical control architecture (the Hierarchical MapReduce Model, HMR) is presented. Under this hierarchical control architecture, data isolation and placement between the private cloud and public clouds according to the privacy characteristics of the data is implemented by the control...
The function of resilin in beetle wings.
This account shows the distribution of elastic elements in hind wings in the scarabaeid Pachnoda marginata and coccinellid Coccinella septempunctata (both Coleoptera). Occurrence of resilin, a rubber-like protein, in some mobile joints together with data on wing unfolding and flight kinematics suggest that resilin in the beetle wing has multiple functions. First, the distribution pattern of res...
Security and Privacy Aspects in MapReduce on Clouds: A Survey
MapReduce is a programming system for processing large-scale data in a distributed, efficient and fault-tolerant manner on a private, public, or hybrid cloud. MapReduce is extensively used daily around the world as an efficient distributed computation tool for a large class of problems, e.g., search, clustering, log analysis, different types of join operations, matrix multiplication, pattern ma...